Genotype calling and haplotyping in parent-offspring trios.

Authors

  • Wei Chen
  • Bingshan Li
  • Zhen Zeng
  • Serena Sanna
  • Carlo Sidore
  • Fabio Busonero
  • Hyun Min Kang
  • Yun Li
  • Gonçalo R Abecasis
Abstract

Emerging sequencing technologies allow common and rare variants to be systematically assayed across the human genome in many individuals. To improve variant detection and genotype calling, raw sequence data are typically examined across many individuals. Here, we describe a method for genotype calling in settings where sequence data are available for unrelated individuals and parent-offspring trios and show that modeling trio information can greatly increase the accuracy of inferred genotypes and haplotypes, especially on low- to modest-depth sequencing data. Our method considers both linkage disequilibrium (LD) patterns and the constraints imposed by family structure when assigning individual genotypes and haplotypes. Using simulations, we show that trios provide higher genotype calling accuracy across the frequency spectrum, both overall and at hard-to-call heterozygous sites. In addition, trios provide greatly improved phasing accuracy, which in turn improves the accuracy of downstream analyses (such as genotype imputation) that rely on phased haplotypes. To further evaluate our approach, we analyzed data on the first 508 individuals sequenced by the SardiNIA sequencing project. Our results show that our method reduces the genotyping error rate by 50% compared with analysis using existing methods that ignore family structure. We anticipate our method will facilitate genotype calling and haplotype inference for many ongoing sequencing projects.
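
To illustrate the kind of constraint that family structure imposes, the following is a minimal single-site sketch in Python. It is not the authors' method, which also models LD across sites; it only combines per-individual genotype likelihoods with Mendelian transmission probabilities for one trio at one biallelic site. All function names, the Hardy-Weinberg prior on the parents, and the example likelihood values are assumptions made for illustration.

```python
from itertools import product

# Genotypes are coded as the number of alternate alleles (0, 1, or 2).
GENOTYPES = (0, 1, 2)


def transmit_prob(parent_gt, allele):
    """P(parent passes on `allele`), where allele is 0 (ref) or 1 (alt)."""
    p_alt = parent_gt / 2.0
    return p_alt if allele == 1 else 1.0 - p_alt


def child_given_parents(child_gt, father_gt, mother_gt):
    """Mendelian probability of the child's genotype given both parents."""
    total = 0.0
    for a_f, a_m in product((0, 1), repeat=2):
        if a_f + a_m == child_gt:
            total += transmit_prob(father_gt, a_f) * transmit_prob(mother_gt, a_m)
    return total


def hwe_prior(gt, alt_freq):
    """Hardy-Weinberg genotype prior (an assumption of this sketch)."""
    p, q = 1.0 - alt_freq, alt_freq
    return (p * p, 2.0 * p * q, q * q)[gt]


def call_trio(lik_father, lik_mother, lik_child, alt_freq):
    """Return the most probable (father, mother, child) genotypes and
    their joint posterior probability at a single site.

    Each lik_* is a sequence of three genotype likelihoods
    P(reads | genotype) for genotypes 0, 1, 2.
    """
    best, best_score, total = None, -1.0, 0.0
    for f, m, c in product(GENOTYPES, repeat=3):
        score = (
            hwe_prior(f, alt_freq)
            * hwe_prior(m, alt_freq)
            * child_given_parents(c, f, m)
            * lik_father[f]
            * lik_mother[m]
            * lik_child[c]
        )
        total += score
        if score > best_score:
            best, best_score = (f, m, c), score
    return best, best_score / total


# Hypothetical low-depth example: the child's reads alone slightly favor
# homozygous reference, but the parents' data make that call implausible.
father = (0.001, 0.01, 0.989)   # clearly homozygous alternate
mother = (0.99, 0.01, 0.0001)   # clearly homozygous reference
child = (0.5, 0.45, 0.05)       # ambiguous between 0 and 1
call, posterior = call_trio(father, mother, child, alt_freq=0.2)
print(call, round(posterior, 3))
```

In this made-up example the child's own reads are ambiguous, yet the joint call assigns the child a heterozygous genotype because a homozygous-alternate father and a homozygous-reference mother can only produce a heterozygous child. This is the intuition behind the gain at hard-to-call heterozygous sites described in the abstract.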

Similar resources

Imputation of parent-offspring trios and their effect on accuracy of genomic prediction using Bayesian method

The objective of this study was to evaluate the imputation accuracy of parent-offspring trios under different scenarios. Using simulated datasets, the performance of Bayesian LASSO in genomic prediction was also examined. The genome consisted of 5 chromosomes, each 1 Morgan long, with 10,000 SNPs per chromosome. One hundred QTLs were randomly distributed a...

A statistical variant calling approach from pedigree information and local haplotyping with phase informative reads

MOTIVATION Variant calling from genome-wide sequencing data is essential for the analysis of disease-causing mutations and elucidation of disease mechanisms. However, variant calling in low coverage regions is difficult due to sequence read errors and mapping errors. Hence, variant calling approaches that are robust to low coverage data are needed. RESULTS We propose a new variant calling a...

Comparison of three boosting methods in parent-offspring trios for genotype imputation using simulation study

BACKGROUND Genotype imputation is an important process for predicting unknown genotypes, which uses a reference population with dense genotypes to predict missing genotypes for both human and animal genetic variations at a low cost. Machine learning methods, especially boosting methods, have been used in genetic studies to explore the underlying genetic profile of disease and build models capable of ...

Association between C677T and A1298C MTHFR gene polymorphism and nonsyndromic orofacial clefts in the Turkish population: a case-parent study.

Two common MTHFR gene polymorphisms (C677T and A1298C) have been implicated in the etiology of nonsyndromic cleft lip/palate (nsCL/P). To investigate the genotype association among nsCL/P in the Turkish population, 56 case-parent trios were recruited into the study. Genotype frequencies were compared to two groups of controls from the same population. A total of 46 case-parent trios were includ...

A novel method for detecting uniparental disomy from trio genotypes identifies a significant excess in children with developmental disorders.

Exome sequencing of parent-offspring trios is a popular strategy for identifying causative genetic variants in children with rare diseases. This method owes its strength to the leveraging of inheritance information, which facilitates de novo variant calling, inference of compound heterozygosity, and the identification of inheritance anomalies. Uniparental disomy describes the inheritance of a h...

Journal:
  • Genome research

Volume 23, Issue 1

Publication date: 2013